SemMT: A Semantic-Based Testing Approach for Machine Translation Systems
Abstract
Machine translation has wide applications in daily life. In mission-critical applications such as translating official documents, incorrect translation can have unpleasant or sometimes catastrophic consequences. This motivates recent research on testing methodologies for machine translation systems. Existing methodologies mostly rely on metamorphic relations designed at the textual level (e.g., Levenshtein distance) or syntactic level (e.g., the distance between grammar structures) to determine the correctness of translation results. However, these metamorphic relations do not consider whether the original and translated sentences have the same meaning (i.e., semantic similarity). Therefore, in this paper, we propose SemMT, an automatic testing approach for machine translation systems based on semantic similarity checking. SemMT applies round-trip translation and measures the semantic similarity between the original and translated sentences. Our insight is that the semantics expressed by the logic and numeric constraints in sentences can be captured using regular expressions (or deterministic finite automata), for which efficient equivalence/similarity checking algorithms are available. Leveraging this insight, we propose three semantic similarity metrics and implement them in SemMT. The experiment results reveal that SemMT can achieve higher effectiveness compared with state-of-the-art works, achieving an increase of 21% and 23% in accuracy and F-Score, respectively. We also explore potential improvements that can be achieved when proper combinations of metrics are adopted. Finally, we discuss a solution to locate the suspicious trip in round-trip translation, which may shed light on further exploration.
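The idea of comparing the constraints of two sentences through regular expressions can be illustrated with a minimal sketch. It is not one of the three SemMT metrics: it assumes the original and round-trip sentences have already been converted to regular expressions by some means, and it approximates similarity by a Jaccard score over all strings up to a bounded length. The function names, the alphabet, and the length bound are hypothetical choices made for illustration only.

```python
import re
from itertools import product


def accepted_strings(pattern, alphabet, max_len):
    """Return every string over `alphabet`, of length 0..max_len,
    that `pattern` matches in full."""
    compiled = re.compile(pattern)
    accepted = set()
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                accepted.add(candidate)
    return accepted


def bounded_similarity(pattern_a, pattern_b, alphabet="ab", max_len=6):
    """Jaccard similarity of the two accepted sets, restricted to
    strings of length <= max_len (a bounded approximation, not an
    exact language-equivalence check)."""
    set_a = accepted_strings(pattern_a, alphabet, max_len)
    set_b = accepted_strings(pattern_b, alphabet, max_len)
    union = set_a | set_b
    if not union:
        return 1.0  # both accept nothing within the bound
    return len(set_a & set_b) / len(union)


# Hypothetical constraint extracted from the original sentence:
# "two or more 'a' characters".
original = "a{2,}"
# A round trip that preserves the constraint, written differently.
faithful = "aa+"
# A round trip that drifted to "exactly two 'a' characters".
drifted = "a{2}"

same_score = bounded_similarity(original, faithful)
drift_score = bounded_similarity(original, drifted)
```

Under these assumptions, a score near 1 suggests the round trip kept the constraint, while a low score flags a possible mistranslation. An exact check would compare the corresponding automata instead of enumerating strings.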
Similar resources
Semantic Web based Machine Translation
This paper describes the experimental combination of traditional Natural Language Processing (NLP) technology with the Semantic Web building stack in order to extend the expert knowledge required for a Machine Translation (MT) task. Therefore, we first give a short introduction in the state of the art of MT and the Semantic Web and discuss the problem of disambiguation being one of the common c...
Knowledge-Based Semantic Embedding for Machine Translation
In this paper, with the help of knowledge base, we build and formulate a semantic space to connect the source and target languages, and apply it to the sequence-to-sequence framework to propose a Knowledge-Based Semantic Embedding (KBSE) method. In our KBSE method, the source sentence is firstly mapped into a knowledge based semantic space, and the target sentence is generated using a recurrent...
A scalarization-based method for multiple part-type scheduling of two-machine robotic systems with non-destructive testing technologies
This paper analyzes the performance of a robotic system with two machines in which machines are configured in a circular layout and produce non-identical parts repetitively. The non-destructive testing (NDT) is performed by a stationary robotic arm located in the center of the circle, or a cluster tool. The robotic arm integrates multiple tasks, mainly the NDT of the part and its transition bet...
A Bilingual Graph-Based Semantic Model for Statistical Machine Translation
Rui Wang, Hai Zhao, Sabine Ploux, Bao-Liang Lu, and Masao Utiyama. Department of Computer Science and Eng.; Key Lab of Shanghai Education Commission for Intelligent Interaction and Cognitive Eng., Shanghai Jiao Tong University, Shanghai, China; Centre National de la Recherche Scientifique, CNRS-L2C2, France; National Institute of Information and Communications Technology, Kyoto, Japan. wangrui....
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
Journal
Journal title: ACM Transactions on Software Engineering and Methodology
Year: 2022
ISSN: 1049-331X, 1557-7392
DOI: https://doi.org/10.1145/3490488